Classical simulability and the significance of modular exponentiation in Shor's algorithm

We show that a classical algorithm efficiently simulating the modular
exponentiation circuit, for a certain product state input and with
measurements in a general product state basis at the output, can
efficiently simulate Shor's factoring algorithm. This is done by using
the notion of the semi-classical Fourier transform due to Griffiths and
Niu, further discussed in the context of Shor's algorithm by Browne.

The most celebrated quantum algorithm to date is undoubtedly
Shor's factoring algorithm. Distilling the crucial elements in
this algorithm which allow for the (assumed) speed-up it
exhibits may lead to a better understanding of the power of
quantum computation in general. Shor's algorithm has two main
components: modular exponentiation, and the quantum Fourier
transform (QFT). Of the two, the first is basically a classical
circuit employing classical gates (manipulating computational
basis states). The only quantum aspect of this circuit is that
it maintains quantum coherence, i.e. it can act on a
superposition of classical inputs and yields the corresponding
superposition of classical outputs. The QFT, on the other hand,
uses gates that have no classical equivalents, such as
conditional phase and Hadamard gates, and is considered the
truly quantum component of the algorithm. The QFT is a key
component not only in Shor's algorithm but also in several
related quantum algorithms such as phase estimation and
discrete logarithm.

Yet, it was recently demonstrated that an approximate
quantum Fourier transform (which can be used in Shor's
algorithm) can be efficiently classically simulated using
tensor contraction methods. This was shown for any product
input state and product state measurements on the output, and
furthermore for a class of entangled states (as input or as
basis for output measurements). This seems to indicate that
the computational power of Shor's algorithm lies with the modular
exponentiation circuit. Here, we demonstrate that this 
is indeed the case by the following simple observation: 
Any classical algorithm that can efficiently simulate the 
circuit implementing modular exponentiation for general 
product input states and product state measurements on 
the output, allows for an efficient simulation of the entire 
Shor algorithm on a classical computer. In other words 
the power of Shor's algorithm lies in the ability to imple- 
ment the classical modular exponentiation operation on 
a certain product state input and with measurements in 
a general product state basis. 

The above result is also true for any circuit that can 
replace the modular exponentiation in the overall quan- 
tum circuit for Shor's algorithm (Fig. 1). For instance, 
the multiplication operations used in the log-depth ver- 
sion of the (quantum part of) Shor's algorithm due to 

A particular method for classically simulating quantum
circuits, in which general product input and output states
are automatically taken into account, is tensor contraction.
Therefore, our observation implies that any tensor
contraction scheme that efficiently simulates modular
exponentiation would be able to simulate Shor's factoring
algorithm efficiently.

Let us first describe in exact terms what we consider 
as a simulation of a quantum circuit for product input 
states and product state measurements. For a given
quantum circuit let us denote a set of (general) single
qubit measurements on the output by $M_1, \ldots, M_n$, their
corresponding sets of possible outcomes by $(m_1, \ldots, m_n)$
and some specific outcomes of these measurements by
$(r_1, \ldots, r_n)$. We say that a quantum circuit can be efficiently
simulated by a classical computer for product
state input and product state measurements if for any 
such set of single qubit measurements and any product 
state input there is an efficient classical algorithm for 
calculating the conditional probabilities: 

$P(m_i \mid r_{j_1}, \ldots, r_{j_k})$, (1)

where the indices $j_1, \ldots, j_k$ correspond to any subset of
the measurements $(M_1, \ldots, M_n)$, including the empty set.
(Of course, there are exponentially many such conditional
probabilities, one for each value of the bit-string
$r_{j_1}, \ldots, r_{j_k}$; therefore there is no way to efficiently
calculate all of them. However, we only require that each
particular conditional probability can be calculated
efficiently.) Sampling from these conditional probabilities
qubit by qubit one is able, using the classical algorithm, 
to obtain a final outcome with the same probability as it 
would have been obtained by the quantum computer. 
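
The qubit-by-qubit sampling described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the joint distribution `JOINT` is a made-up three-bit example, and `cond_prob` is a brute-force stand-in for the efficient classical algorithm whose existence is assumed in the text. All names are hypothetical.

```python
import random

# Toy joint distribution over three output bits (made-up numbers, illustration only).
JOINT = {
    (0, 0, 0): 0.10, (0, 0, 1): 0.05, (0, 1, 0): 0.20, (0, 1, 1): 0.15,
    (1, 0, 0): 0.05, (1, 0, 1): 0.25, (1, 1, 0): 0.10, (1, 1, 1): 0.10,
}

def marginal(prefix):
    """Probability that the first len(prefix) bits equal `prefix`."""
    k = len(prefix)
    return sum(p for bits, p in JOINT.items() if bits[:k] == tuple(prefix))

def cond_prob(bit, prefix):
    """P(next bit = bit | earlier outcomes = prefix); brute-force stand-in oracle."""
    return marginal(list(prefix) + [bit]) / marginal(prefix)

def sample_bitwise(n, rng):
    """Draw one n-bit outcome by sampling each bit from its conditional probability."""
    outcomes = []
    for _ in range(n):
        p_one = cond_prob(1, outcomes)
        outcomes.append(1 if rng.random() < p_one else 0)
    return tuple(outcomes)

def chain_probability(bits):
    """Probability that sample_bitwise returns `bits` (product of the conditionals)."""
    prob = 1.0
    for i, b in enumerate(bits):
        prob *= cond_prob(b, bits[:i])
    return prob
```

Because the product of the conditional probabilities equals the joint probability, `chain_probability(bits)` agrees with `JOINT[bits]` for every bit-string, which is why only n conditional probabilities need to be evaluated per sample.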

Our definition above for a simulation of a quantum
circuit is similar to the 'density computation of quantum
circuit' given by Terhal and DiVincenzo. The difference is
that here we allow general product state input and product
state measurements at the output, whereas in the weaker
density computation only computational basis input and
measurements in the computational basis at the output are
permitted.

FIG. 1: The quantum circuit for Shor's algorithm. The empty
triangles on the left side represent the computational basis
input state, the shaded triangles on the right represent
measurements in the computational basis, and the boxes denoted
by H stand for Hadamard gates. The QFT together with the
output measurements can be replaced by the semi-classical
circuit in Fig. 2.

The quantum part of Shor's algorithm, where we wish
to factor an n-bit integer N, is composed of the
following steps (Fig. 1).

1. Initialize two registers, the first with 2n qubits and
the second with n qubits, in the state $|0\rangle_1 |0\rangle_2$.

2. Apply a Hadamard gate on each of the qubits of
register 1. The state of the computer would now
be:

$(|+\rangle \cdots |+\rangle)_1 |0\rangle_2$, (2)

where $|+\rangle = \frac{1}{\sqrt{2}}(|0\rangle + |1\rangle)$.



3. Apply modular exponentiation. Namely, apply the
unitary operation:

$|x\rangle_1 |0\rangle_2 \to |x\rangle_1 |a^x \bmod N\rangle_2$, (3)

where a is a randomly chosen number (a < N) coprime
with N.

4. Apply a QFT on the first register. 

5. Measure the first register in the computational ba- 
sis. 
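
As a concrete illustration of the five steps, the following brute-force sketch computes the exact output distribution of the first register for a toy instance. The values N = 15, a = 7 and a first register of t = 8 qubits (2n with n = 4) are assumptions chosen for the example, and the function name is hypothetical. The cost is exponential in t, so this is a check of the definitions and not an efficient simulation.

```python
import cmath

def shor_output_distribution(N, a, t):
    """Exact P(y) for the first register after steps 1-5, by brute force."""
    Q = 2 ** t
    # Steps 1-3: the state is (1/sqrt(Q)) * sum_x |x>|a^x mod N>.
    # Group the values of x by the content of the second register.
    preimages = {}
    for x in range(Q):
        preimages.setdefault(pow(a, x, N), []).append(x)
    # Steps 4-5: QFT on register 1, then a computational-basis measurement.
    # Different second-register values are orthogonal, so probabilities add.
    probs = []
    for y in range(Q):
        p = 0.0
        for xs in preimages.values():
            amp = sum(cmath.exp(2j * cmath.pi * x * y / Q) for x in xs) / Q
            p += abs(amp) ** 2
        probs.append(p)
    return probs

probs = shor_output_distribution(N=15, a=7, t=8)
peaks = [y for y, p in enumerate(probs) if p > 1e-9]
```

For this instance the order of 7 modulo 15 is 4, so the distribution is supported on the multiples of 256/4 = 64, each with probability 1/4.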

In order to prove our result above we make use of
the following fact, first demonstrated by Griffiths and
Niu: The QFT circuit followed by measurements in
the computational basis can always be replaced by a
'semi-classical' QFT circuit which includes only single
qubit gates, measurements in the computational basis,
and feed-forward (without any two qubit gates). Griffiths
and Niu observed that when a controlled unitary is
immediately followed by a measurement in the computational
basis on the control qubit, the gate operation can
be implemented by first measuring the control qubit and
then applying a gate on the second qubit according to
the outcome (that is, the gate would only be applied if
the control is measured in the required state). In Shor's
algorithm the QFT is immediately followed by measurements
in the computational basis; therefore it can be replaced
by the semi-classical circuit (shown in Fig. 2).

FIG. 2: The semi-classical QFT circuit. The triangles
denote measurements in the computational basis and the dashed
lines signify the fact that the application of the different
single qubit phase gates (denoted by $P_i$) is conditioned on the
outcomes of those measurements.
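
The equivalence between the QFT followed by measurement and the semi-classical circuit can be checked numerically on a small register. The sketch below is an illustration under assumptions: the qubit ordering (the most significant input qubit is measured first and yields the least significant output bit), the test state and all function names are choices made for this example and are not taken from the original circuit diagrams.

```python
import cmath
import math

def qft_distribution(psi):
    """P(y) for a full QFT followed by a computational-basis measurement."""
    Q = len(psi)
    return [abs(sum(amp * cmath.exp(2j * math.pi * x * y / Q)
                    for x, amp in enumerate(psi))) ** 2 / Q
            for y in range(Q)]

def semiclassical_distribution(psi):
    """P(y) when qubits are measured one at a time with feed-forward phases."""
    t = len(psi).bit_length() - 1
    probs = [0.0] * len(psi)

    def measure(state, outcomes):
        j = len(outcomes)
        if j == t:
            y = sum(m << k for k, m in enumerate(outcomes))
            probs[y] = abs(state[0]) ** 2
            return
        # Feed-forward: the phase gate on qubit j is fixed by earlier classical outcomes.
        theta = math.pi * sum(m / 2 ** (j - k) for k, m in enumerate(outcomes))
        phase = cmath.exp(1j * theta)
        half = len(state) // 2  # first half: qubit j is 0; second half: qubit j is 1
        for m in (0, 1):
            sign = -1 if m else 1
            # Phase gate, Hadamard, then projection of qubit j onto outcome m.
            projected = [(state[i] + sign * phase * state[half + i]) / math.sqrt(2)
                         for i in range(half)]
            measure(projected, outcomes + [m])

    measure(list(psi), [])
    return probs

raw = [1, 2j, -1, 0.5, 3, -1j, 0.25, 1 + 1j]  # arbitrary 3-qubit amplitudes
norm = math.sqrt(sum(abs(c) ** 2 for c in raw))
psi = [c / norm for c in raw]
full = qft_distribution(psi)
semi = semiclassical_distribution(psi)
```

The two lists `full` and `semi` agree up to rounding, although the second procedure uses only single-qubit operations and classical control.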

Browne [9] used the semi-classical QFT to show that
in the case where the modular exponentiation circuit
together with Hadamard gates at the input does not
produce much entanglement, and therefore can be simulated
classically [12], Shor's algorithm can be efficiently
simulated as well (typically modular exponentiation does
produce highly entangled states [11]; however, there may
be special cases for which it does not).

Let us now assume that the modular exponentiation 
circuit can be simulated efficiently when the input is a 
direct product and the output is subjected to single qubit 
measurements. Our simulation of Shor's algorithm pro- 
ceeds via an iterative procedure as follows: 

1. First, we calculate the probabilities $P_{\mathrm{Shor}}(m_1)$ for a
measurement on the first qubit (in the first register)
in Shor's algorithm.

From our assumption it follows that these proba- 
bilities can be efficiently calculated. Indeed, in the 
Shor circuit with the semi-classical QFT (which 
obviously produces the same output as the origi- 
nal Shor algorithm) the measurement of the first 
qubit in the computational basis at the output of 
the semi-classical QFT is nothing else than a single 
qubit measurement of the output of the modular 
exponentiation. (More precisely, it is the measure- 
ment of the first qubit of the output of the mod- 
ular exponentiation in the Hadamard transformed 
basis.) Furthermore, the input state into the modular
exponentiation is a direct product state (the
state $(|+\rangle \cdots |+\rangle)_1 |0\rangle_2$), so the conditions in our
assumption apply.

2. We sample from the distribution $P_{\mathrm{Shor}}(m_1)$. Let
$r_1$ be the result of this sampling.

3. We calculate the conditional probabilities
$P_{\mathrm{Shor}}(m_2 \mid r_1)$ for the measurement on the second
qubit in Shor's algorithm, given the outcome $r_1$
for the first qubit.

Again, from our assumption it follows that these 
probabilities can be efficiently calculated. Once 
the output $r_1$ of the first qubit is fixed, we know
the feed-forward from the first qubit in the
semi-classical QFT. With this knowledge, the measurement
in the computational basis of the second
qubit in the circuit for Shor's algorithm becomes 
a well-defined measurement on the second qubit of 
the output of the modular exponentiation circuit. 

4. We repeat steps (2) and (3) for i = 2, . . . , 2n. That
is, in the next step we sample from the conditional
probability distribution $P_{\mathrm{Shor}}(m_2 \mid r_1)$ and obtain
outcome $r_2$, and in general we calculate and then
sample from

$P_{\mathrm{Shor}}(m_i \mid r_1, \ldots, r_{i-1})$ for $2 \le i \le 2n$, (4)

where the basis for the measurement $m_i$ is set
according to the outcomes of previous measurements
$r_1, \ldots, r_{i-1}$.

At the end of the process an outcome $(r_1, \ldots, r_{2n})$ is
obtained with the same probability it would have been
obtained by measuring the output of the quantum circuit
implementing Shor's algorithm.
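
The iterative procedure can be sketched as follows for a toy instance. Everything here is an assumption made for illustration: the instance N = 15, a = 7 with an 8-qubit first register, the function names, and above all the brute-force state vector, which stands in for the assumed efficient simulator of modular exponentiation and takes exponential time. Only the structure matters: each step evaluates a single-qubit measurement probability on the output of the modular exponentiation, in a basis fixed by the earlier outcomes.

```python
import cmath
import math
import random

def project(vec, theta, m):
    """Project the leading qubit onto outcome m of the phase-rotated Hadamard basis."""
    half = len(vec) // 2
    coeff = cmath.exp(1j * theta) * (-1 if m else 1)
    return [(vec[i] + coeff * vec[half + i]) / math.sqrt(2) for i in range(half)]

def simulate_shor(N, a, t, rng):
    """Sample one outcome of the first register, one qubit at a time."""
    Q = 2 ** t
    # Output of modular exponentiation on (|+>...|+>)_1 |0>_2,
    # stored as one vector over x for each value of the second register.
    branches = {}
    for x in range(Q):
        branches.setdefault(pow(a, x, N), [0.0] * Q)[x] = 1 / math.sqrt(Q)
    outcomes = []
    for i in range(t):
        # Measurement basis of qubit i, set by the earlier outcomes.
        theta = math.pi * sum(r / 2 ** (i - k) for k, r in enumerate(outcomes))
        candidates = {}
        for m in (0, 1):
            projected = {f: project(vec, theta, m) for f, vec in branches.items()}
            weight = sum(abs(c) ** 2 for vec in projected.values() for c in vec)
            candidates[m] = (weight, projected)
        # Conditional probability of outcome 1 given the earlier outcomes.
        p_one = candidates[1][0] / (candidates[0][0] + candidates[1][0])
        r = 1 if rng.random() < p_one else 0
        outcomes.append(r)
        branches = candidates[r][1]
    # In this convention the first measured qubit is the least significant output bit.
    return sum(r << k for k, r in enumerate(outcomes))

rng = random.Random(1)
samples = [simulate_shor(15, 7, 8, rng) for _ in range(20)]
```

For this instance every sample is a multiple of 64, as expected for a period of 4 and a first register of dimension 256.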

Clearly, for the purpose of simulating the Shor al- 
gorithm it is enough to consider only one input state 
(Eq. (2)). (Note that we could not simply redefine this as
a new computational basis state $|0'\rangle$ if we do not want
to change the modular exponentiation circuit.) For the 
output measurements, however, we need to take into account
every possible phase gate. It was noted by Browne
[9] that if we consider the above input state but allow
output measurements only in the computational basis, then
the modular exponentiation circuit can be simulated ef- 
ficiently. In the same way it is not hard to see that if one 
considers only computational basis input states and al- 
lows any product state measurement at the output, then 
the circuit would be simulable as well. Thus, in some 
sense, our requirements are the minimal ones with which 
quantum advantage is achieved. 

Consider now a tensor contraction method for
simulation [1]. In these methods one associates tensors to
circuit elements - one and two qubit gates, single-qubit 
input and output states. The latter may correspond to 
outcomes of single qubit measurements or to unmeasured 
output qubits. The rank of the tensors is determined by 
the number of (input and output) qubits on which the 
circuit element operates. That is, the tensor has an in- 
dex for each input or output wire connected to the circuit 
element. Thus, an input state or an output measurement
corresponds to a tensor of rank one, single-qubit gates
correspond to rank two tensors and two-qubit gates are
represented by rank four tensors. Two circuit elements
connected by a qubit wire (the output of one is the input 
of the other) share a joint index. A probability for obtain- 
ing a certain outcome at one or more output qubits can be 
calculated by contracting (summing over all indices of) 
all tensors representing the circuit with the appropriate 
configuration at the outputs (tensors corresponding to 
the required outcomes for the measured qubits and ten- 
sors corresponding to unmeasured qubits for the rest). 
The problem with such a contraction process is that the 
number of terms is exponentially large. To avoid this, the 
tensors are contracted one at a time breaking the overall 
sum to a series of separate sums, where in each step two 
existing tensors are replaced with a new tensor obtained 
by summing over joint indices. For instance, summing
over a joint index of a pair of two qubit gates connected
by a single qubit wire one obtains a tensor of rank six
(e.g. $\sum_{\beta} T^{\alpha\beta}_{\gamma\delta} T^{\epsilon\zeta}_{\beta\eta} \to T^{\alpha\epsilon\zeta}_{\gamma\delta\eta}$). A tensor contraction simulation
is efficient (i.e. can be implemented in polynomial time)
if the tensors generated in the procedure have at most
$O(\log n)$ indices.
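
A minimal sketch of pairwise contraction is given below for a two-qubit circuit (a Hadamard gate followed by a CNOT). It is a simplified illustration: it computes the amplitude of a fully specified outcome, not the more general probabilities with unmeasured output qubits discussed above, and the tensor representation, wire labels and function names are assumptions made for this example.

```python
import math
from collections import defaultdict

def contract(a, b):
    """Contract two tensors over every index label they share.

    A tensor is a pair (labels, data): `labels` names its indices and
    `data` maps index tuples to the nonzero entries.
    """
    (la, da), (lb, db) = a, b
    shared = [lab for lab in la if lab in lb]
    out = defaultdict(complex)
    for ia, va in da.items():
        ma = dict(zip(la, ia))
        for ib, vb in db.items():
            mb = dict(zip(lb, ib))
            if all(ma[lab] == mb[lab] for lab in shared):
                key = (tuple(ma[lab] for lab in la if lab not in shared)
                       + tuple(mb[lab] for lab in lb if lab not in shared))
                out[key] += va * vb
    labels = (tuple(lab for lab in la if lab not in shared)
              + tuple(lab for lab in lb if lab not in shared))
    return labels, dict(out)

s = 1 / math.sqrt(2)
# Rank-two tensor: Hadamard with output wire "b" and input wire "a".
hadamard = (("b", "a"), {(o, i): s * (-1) ** (o * i) for o in (0, 1) for i in (0, 1)})
# Rank-four tensor: CNOT with outputs "c", "e" and inputs "b" (control), "d" (target).
cnot = (("c", "e", "b", "d"), {(b, d ^ b, b, d): 1.0 for b in (0, 1) for d in (0, 1)})

def ket(label, value):
    """Rank-one tensor for a computational-basis input or output on wire `label`."""
    return ((label,), {(value,): 1.0})

def amplitude(r0, r1):
    """Amplitude of outcome (r0, r1) for the input |0>|0>."""
    tensors = [ket("a", 0), hadamard, ket("d", 0), cnot, ket("c", r0), ket("e", r1)]
    result = tensors[0]
    for tensor in tensors[1:]:
        result = contract(result, tensor)  # one pairwise contraction per step
    return result[1].get((), 0.0)
```

With this contraction order no intermediate tensor has more than two indices; the outcomes (0, 0) and (1, 1) each have probability `abs(amplitude(r0, r1)) ** 2` equal to 1/2, and the other two outcomes have amplitude zero.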

The only factor which determines the complexity of 
a contraction process of a given quantum circuit is its 
topology. The type of gates or their actual operation 
on the input (beyond the fact that it is linear) is irrelevant.
Furthermore, rank one tensors, representing the
input and output elements, and rank two tensors corre- 
sponding to single-qubit gates, do not affect the topology 
of the circuit and therefore do not affect the efficiency of 
the simulation. For example, these tensors can be in- 
corporated into tensors of neighbouring elements by con- 
tracting them together at the first stage of the simulation. 
This produces new tensors with the same number of in- 
dices as the original neighbouring tensor (in the case of 
one-qubit gates) or less (in the case of input or output 
elements) without changing the graph of connections of 
the circuit. 

From our discussion above it is clear that a tensor con- 
traction simulation for a given quantum circuit with a 
certain set of input and output elements would also work 
for any other set of input states and output measurements,
as long as these are single qubit states and measurements.
Thus, a tensor contraction algorithm simulating
the modular exponentiation would also be able to
simulate Shor's factoring algorithm. Note that this im- 
plies that it is unlikely that a tensor contraction scheme 
would be able to efficiently compute modular exponenti- 
ation even for 'classical' computational basis input and 
output states, a task which obviously can be done by 
other classical algorithms. 

So far we have discussed only modular exponentiation. 
However from our method of simulation it is clear that 
any quantum circuit with the same structure (as in Fig. 
1), where the modular exponentiation is replaced by some 
other unitary operation U, can be simulated efficiently on 
a classical computer if U is efficiently simulable for product
state inputs and product state measurements, and in particular
if U has an efficient tensor contraction scheme.